Deep bottleneck network based i-vector representation for language identification
نویسندگان
چکیده
This paper presents a unified i-vector framework for language identification (LID) based on deep bottleneck networks (DBN) trained for automatic speech recognition (ASR). The framework covers both front-end feature extraction and back-end modeling stages.The output from different layers of a DBN are exploited to improve the effectiveness of the i-vector representation through incorporating a mixture of acoustic and phonetic information. Furthermore, a universal model is derived from the DBN with a LID corpus. This is a somewhat inverse process to the GMM-UBM method, in which the GMM of each language is mapped from a GMM-UBM. Evaluations on specific dialect recognition tasks show that the DBN based i-vector can achieve significant and consistent performance gains over conventional GMM-UBM and DNN based i-vector methods [1][2]. The generalization capability of this framework is also evaluated using DBNs trained on Mandarin and English corpuses.
منابع مشابه
Named Entity Recognition in Persian Text using Deep Learning
Named entities recognition is a fundamental task in the field of natural language processing. It is also known as a subset of information extraction. The process of recognizing named entities aims at finding proper nouns in the text and classifying them into predetermined classes such as names of people, organizations, and places. In this paper, we propose a named entity recognizer which benefi...
متن کاملLID-senone Extraction via Deep Neural Networks for End-to-End Language Identification
A key problem in spoken language identification (LID) is how to effectively model features from a given speech utterance. Recent techniques such as end-to-end schemes and deep neural networks (DNNs) utilising transfer learning such as bottleneck (BN) features, have demonstrated good overall performance, but have not addressed the extraction of LID-specific features. We thus propose a novel end-...
متن کاملExploiting Hidden-Layer Responses of Deep Neural Networks for Language Recognition
The most popular way to apply Deep Neural Network (DNN) for Language IDentification (LID) involves the extraction of bottleneck features from a network that was trained on automatic speech recognition task. These features are modeled using a classical I-vector system. Recently, a more direct DNN approach was proposed, it consists of estimating the language posteriors directly from a stacked fra...
متن کاملA deep-learning based native-language classification by using a latent semantic analysis for the NLI Shared Task 2017
This paper proposes a deep-learning based native-language identification (NLI) using a latent semantic analysis (LSA) as a participant (ETRI-SLP) of the NLI Shared Task 2017 (Malmasi et al., 2017) where the NLI Shared Task 2017 aims to detect the native language of an essay or speech response of a standardized assessment of English proficiency for academic purposes. To this end, we use the six ...
متن کاملAn analysis of the influence of deep neural network (DNN) topology in bottleneck feature based language recognition
Language recognition systems based on bottleneck features have recently become the state-of-the-art in this research field, showing its success in the last Language Recognition Evaluation (LRE 2015) organized by NIST (U.S. National Institute of Standards and Technology). This type of system is based on a deep neural network (DNN) trained to discriminate between phonetic units, i.e. trained for ...
متن کامل